SDSC Enhances Zone SRB, Comprehensive Data Management System

Refinements in Version 3.2 Support Large-Scale Scientific Collaborations

Published 07/26/2004

The San Diego Supercomputer Center (SDSC) at UC San Diego has released version 3.2 of Zone SRB, the SDSC Storage Resource Broker (SRB) scientific data management system. Version 3.2 offers faster command line performance, support for the Informix database in the SRB Metadata Catalog (MCAT), faster file transfers for users inside firewalls, and numerous improvements in installation, administration, and the SRB server. The software, user manuals, and release notes for version 3.2 are online for download by the research community as a source distribution at http://www.sdsc.edu/DICE/SRB/tarfiles/main.html.

"The SRB is one of the most advanced and comprehensive production tools available for scientific data management," said Reagan Moore, SDSC Distinguished Scientist and co-director of the Data and Knowledge Systems (DAKS) program at SDSC. "The SRB system is used in scientific disciplines from astronomy and environmental sciences to neurosciences, physics, and chemistry, and in projects across agencies including NARA, NSF, NIH, DOE, and NASA as well as many international efforts."

As an end-to-end data management solution, the SRB stores data created in sensor networks or simulations and supports data management and collaboration in data grids, publication in digital libraries, and long-term preservation in digital archives. The flexible Zone SRB system can manage data ranging from simple collections for a single researcher to complex multi-terabyte collections. By supporting the federation of distributed data collections, Zone SRB allows scientific users to flexibly share rapidly changing data collections across multiple institutions that may be spread around the globe, each running its own SRB Metadata Catalog (MCAT), or zone. This provides speedy access to "any data, anywhere, anytime."

As they use the SRB, researchers realize that they are doing more than simply learning a new software application: they are mastering the basic principles of sound scientific data management, essential knowledge for scientists in today's technology-driven world.

"At first we were just planning a minor SRB release," said Arcot Rajasekar, director of SDSC's DAKS Data Grids Technologies group. "But the team worked very hard and so many improvements were ready that we are able to release version 3.2 with many important new features."

What's New in SRB Version 3.2

The SRB is a widely used production tool, and its evolution is guided by input from its large user community. The new capabilities in version 3.2 include the following. Based on the requirements of the Compact Muon Solenoid (CMS) high-energy physics project, which sometimes needs to move very large numbers of small files, a new Shell has been added that executes command-line Scommands more quickly. To meet the need of the Biomedical Informatics Research Network (BIRN) project for efficient parallel file transfers to and from environments within firewalls, client-initiated connections for parallel I/O have been added. In addition, developer Mike Smorul of Joseph Jaja's University of Maryland NASA Earth Science Information Partners (ESIP) project completed a port of the Metadata Catalog (MCAT) component of the SRB to the Informix database, adding another database option for SRB users. Improvements have also been made to the core SRB Server component, which can now more intelligently select an appropriate resource based on the availability of sufficient space and other criteria. The SRB Server can also be put into a maintenance mode for graceful shutdown. Beyond bug fixes, new features have been added for easier installation and administration.

In conjunction with the release of SRB version 3.2, new versions of inQ, the popular Windows graphical user interface to SRB, and MySRB, the web-based access tool, are being released that support SRB 3.2. In addition, new versions of Jargon (Java API for Real Grids On Networks), the SRB Java API, and SDSC Matrix have also been released. Matrix can be used as either a grid workflow management system or for SRB Web Services, and uses the Data Grid Language (DGL) to communicate between Matrix Web Service clients and the server.

New features in Matrix 3.2 include the ability to create dataflows over a very large number of files or data sets; the ability to initiate a new dataflow in parallel based on the number of files, which increases execution speed even for non-parallel applications; provenance tracking of gridflows for all data and processes during and after execution; a developer-friendly Java client API; declaration of scoped variables within the workflow to help in dynamic computational steering based on previous results; and a DGL developer guide.

SDSC SRB version 3.2 is supported on a wide variety of systems. The MCAT Metadata Catalog runs on Oracle, IBM DB2, Sybase, Informix, MySQL, and Postgres. The SRB Server runs on Microsoft Windows NT, 2000, and XP, as well as most UNIX platforms including Linux, IRIX, AIX, HP Tru64, and Mac OS X, and supports data in file systems, tape stores such as the High Performance Storage System (HPSS), and databases. In addition to UNIX clients, APIs include C and C++ library calls, Shell commands, Perl and Python load libraries, dynamic load libraries for Windows, Open Archives Interface, WSDL, and Java classes. Interactive browser interfaces include the Windows graphical user interface, inQ, and the Web interface, MySRB.

The SRB team, led by Reagan Moore and Arcot Rajasekar, includes chief architect Mike Wan, senior developer Wayne Schroeder, data grid application specialist George Kremenek, SRB administrator Sheau-Yen Chen, and data grid developers Charles Cowart, Lucas Gilbert, Arun Jagatheesan, Roman Olschanowsky, Antoine de Torcy, Tim Warnock, and Bing Zhu.

-Paul Tooby

Related Links

SDSC SRB V3.2 information and download - http://www.sdsc.edu/DICE/SRB/tarfiles/main.html
User Guide for SRB - http://www.npaci.edu/SRB/
SDSC Matrix - http://www.npaci.edu/DICE/SRB/matrix/Software/index.html
Overview article on Zone SRB V3.0 - http://www.sdsc.edu/News Items/PR1107031.html
SDSC Data and Knowledge Systems (DAKS) program - http://www.sdsc.edu/daks/